Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(stats): split off transmission to RawSender, implement batching and queueing support, add streamlined prefix and suffix support #6267

Open
wants to merge 12 commits into
base: develop
Choose a base branch
from

Conversation

kwvg
Copy link
Collaborator

@kwvg kwvg commented Sep 11, 2024

Motivation

This pull request achieves the goal originally set out in dash#5167, to migrate the base of our Statsd client implementation to one that is actively maintained. Statoshi (source) utilizes talebook/statsd-client-cpp, which in turn is inherited by Dash.

As Statsd is the only cross-platform reporting mechanism available (USDT requires a Linux host with superuser privileges and RPCs don't provide as much flexibility as desired), emphasis is placed on using Statsd as the primary way for the Dash daemon to report metrics related to node and network health.

As part of maintaining our Statsd client, this PR aims to migrate the base of our implementation to vthiery/cpp-statsd-client, a streamlined implementation that embraces C++11 with a thread-safe implementation of queueing and batching, which reduces the number of packets used to transmit stats.

These capabilities are optional and users can opt not to use them by setting -statsduration=0 to disable queueing (which will also disable batching) or -statsbatchsize=0, which will disable batching (but will not disable queueing unless requested explicitly).

Additional Information

  • Dependent on refactor(stats): modernize statsd::StatsdClient, make global unique_ptr #5167
  • RawSender (and by extension, RawMessage) strive to remain as unopinionated as possible, moving the responsibility to construct valid Statsd messages onto StatsdClient. This is to ensure that RawSender can be reused down the line or independently extended without impacting the Statsd client.
    • RawMessage exists to provide extensions to std::vector<uint8_t> that make it easier to abstract away strings while also implementing some of its semantics like append() (and its alias, +=).
  • InitStatsClient() was introduced to keep StatsdClient indifferent to how arguments are obtained and sanitized before they're supplied to the constructor. This is to keep it indifferent to all the backwards-compatibility code for deprecated arguments still work.
  • When constructing the Statsd message, we can use %f without having to specify a precision as tinyformat automatically assumes a precision of 6 (source) and problems don't seem to be observed when using %f with integers (source).
    • As a guardrail, there is a static_assert to ensure that a specialization of send() involving a non-arithmetic type will raise alarm when compiling (source).

Breaking changes

  • -statsenabled (replaced with specifying -statshost), -statshostname (replaced by -statssuffix) and -statsns (replaced by -statsprefix) have been deprecated and will be removed in a future release.

Checklist:

  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added or updated relevant unit/integration/functional/e2e tests (note: N/A)
  • I have made corresponding changes to the documentation
  • I have assigned this pull request to a milestone (for repository code-owners and collaborators only)

@kwvg kwvg added this to the 21.2 milestone Sep 11, 2024
@kwvg kwvg force-pushed the statsd2 branch 2 times, most recently from 1b1c768 to 1b87c07 Compare September 11, 2024 12:18
PastaPastaPasta added a commit that referenced this pull request Sep 11, 2024
4faf35f chore: add `stats` as a pull request header scope (Kittywhiskers Van Gogh)

Pull request description:

  ## Additional Information

  Would allow tagging #5167, #6267 and #6237 as `feat(stats)`.

  ## Breaking Changes

  None.

  ## Checklist:

  - [x] I have performed a self-review of my own code **(note: N/A)**
  - [x] I have commented my code, particularly in hard-to-understand areas **(note: N/A)**
  - [x] I have added or updated relevant unit/integration/functional/e2e tests **(note: N/A)**
  - [x] I have made corresponding changes to the documentation **(note: N/A)**
  - [x] I have assigned this pull request to a milestone _(for repository code-owners and collaborators only)_ **(note: N/A)**

ACKs for top commit:
  PastaPastaPasta:
    utACK 4faf35f
  UdjinM6:
    utACK 4faf35f
  thephez:
    utACK 4faf35f

Tree-SHA512: 25cc28792351852b9e920d980b8814d93274bc53d22ce63fb1a5bf32821b10902d22b384a408e1c4a7b97239e53e235c45ac008d0f82e1afe5d6071b392acb47
Copy link

This pull request has conflicts, please rebase.

@UdjinM6
Copy link

UdjinM6 commented Sep 12, 2024

Should update test/util/data/non-backported.txt

diff --git a/test/util/data/non-backported.txt b/test/util/data/non-backported.txt
index 150bbb70c1..7684074874 100644
--- a/test/util/data/non-backported.txt
+++ b/test/util/data/non-backported.txt
@@ -29,7 +29,8 @@ src/rpc/quorums.cpp
 src/saltedhasher.*
 src/spork.*
 src/stacktraces.*
-src/statsd_client.*
+src/stats/*.cpp
+src/stats/*.h
 src/test/block_reward_reallocation_tests.cpp
 src/test/bls_tests.cpp
 src/test/dip0020opcodes_tests.cpp

@UdjinM6
Copy link

UdjinM6 commented Sep 12, 2024

not sure if can actually do d42c88f

develop:
network.terahashesPerSecond:0.000000|g
network.petahashesPerSecond:0.000000|g
network.exahashesPerSecond:0.000000|g

this PR:
network.terahashesPerSecond:1.03185e-07|g
network.petahashesPerSecond:1.03185e-10|g
network.exahashesPerSecond:1.03185e-13|g

@UdjinM6
Copy link

UdjinM6 commented Sep 12, 2024

old ideas

could probably make it work with smth like

diff --git a/src/stats/client.cpp b/src/stats/client.cpp
index cdcf36d281..b536a44db3 100644
--- a/src/stats/client.cpp
+++ b/src/stats/client.cpp
@@ -157,6 +157,8 @@ bool StatsdClient::send(const std::string& key, T1 value, const std::string& typ
         return true;
     }
 
+    if (value < EPSILON) value = 0;
+
     // Construct the message and if our message isn't always-send, report the sample rate
     std::string buf{strprintf("%s%s%s:%s|%s", m_prefix, key, m_suffix, ToString(value), type)};
     if (frequency < 1.f) {

which results in
network.terahashesPerSecond:0|g
network.petahashesPerSecond:0|g
network.exahashesPerSecond:0|g

a better option would probably be smth like

diff --git a/src/stats/client.cpp b/src/stats/client.cpp
index cdcf36d281..832085c8ef 100644
--- a/src/stats/client.cpp
+++ b/src/stats/client.cpp
@@ -6,7 +6,6 @@
 
 #include <stats/client.h>
 
-#include <util/string.h>
 #include <util/system.h>
 
 #include <cmath>
@@ -25,6 +24,7 @@ static constexpr char STATSD_NS_DELIMITER{'.'};
 static constexpr char STATSD_METRIC_COUNT[]{"c"};
 /** Character used to denote Statsd message type as gauge */
 static constexpr char STATSD_METRIC_GAUGE[]{"g"};
+static constexpr char STATSD_METRIC_GAUGE_DOUBLE[]{"g"};
 /** Characters used to denote Statsd message type as timing */
 static constexpr char STATSD_METRIC_TIMING[]{"ms"};
 } // anonymous namespace
@@ -133,7 +133,7 @@ bool StatsdClient::gauge(const std::string& key, int64_t value, float frequency)
 
 bool StatsdClient::gaugeDouble(const std::string& key, double value, float frequency)
 {
-    return send(key, value, STATSD_METRIC_GAUGE, frequency);
+    return send(key, value, STATSD_METRIC_GAUGE_DOUBLE, frequency);
 }
 
 bool StatsdClient::timing(const std::string& key, uint64_t ms, float frequency)
@@ -158,7 +158,12 @@ bool StatsdClient::send(const std::string& key, T1 value, const std::string& typ
     }
 
     // Construct the message and if our message isn't always-send, report the sample rate
-    std::string buf{strprintf("%s%s%s:%s|%s", m_prefix, key, m_suffix, ToString(value), type)};
+    std::string buf;
+    if (type == STATSD_METRIC_GAUGE_DOUBLE) {
+        buf = strprintf("%s%s%s:%f|%s", m_prefix, key, m_suffix, value, type);
+    } else {
+        buf = strprintf("%s%s%s:%d|%s", m_prefix, key, m_suffix, value, type);
+    }
     if (frequency < 1.f) {
         buf += strprintf("|@%.2f", frequency);
     }

actually, strprintf is smart enough for this to work exactly as before

diff --git a/src/stats/client.cpp b/src/stats/client.cpp
index cdcf36d281..55f5f458c0 100644
--- a/src/stats/client.cpp
+++ b/src/stats/client.cpp
@@ -6,7 +6,6 @@
 
 #include <stats/client.h>
 
-#include <util/string.h>
 #include <util/system.h>
 
 #include <cmath>
@@ -158,7 +157,7 @@ bool StatsdClient::send(const std::string& key, T1 value, const std::string& typ
     }
 
     // Construct the message and if our message isn't always-send, report the sample rate
-    std::string buf{strprintf("%s%s%s:%s|%s", m_prefix, key, m_suffix, ToString(value), type)};
+    std::string buf{strprintf("%s%s%s:%.6f|%s", m_prefix, key, m_suffix, value, type)};
     if (frequency < 1.f) {
         buf += strprintf("|@%.2f", frequency);
     }

%f works too... for all int-s and float.. thanks to tinyformat internal magic :D

src/stats/rawsender.cpp Outdated Show resolved Hide resolved
src/stats/rawsender.h Outdated Show resolved Hide resolved
src/stats/rawsender.h Outdated Show resolved Hide resolved
Copy link

This pull request has conflicts, please rebase.

kwvg and others added 2 commits September 12, 2024 14:53
In upcoming commits, message sending will be split off into a separate
file and stats capabilities will be fleshed out. Prepare for that by
giving it its own directory.

Also get rid of `statsd` namespace, it is entirely unnecessary.

Co-authored-by: UdjinM6 <[email protected]>
@kwvg kwvg changed the title feat(stats): split off transmission to RawSender, implement batching and queueing support, add streamlined suffix and prefix support feat(stats): split off transmission to RawSender, implement batching and queueing support, add streamlined prefix and suffix support Sep 12, 2024
@kwvg kwvg requested a review from UdjinM6 September 12, 2024 17:40
kwvg and others added 3 commits September 13, 2024 06:47
`RawSender` is inspired by `UDPSender` from `vthiery/cpp-statsd-client`
and separating it out of `StatsdClient` is needed to implement queueing
and batching support in upcoming commits.

This is the start of migrating our Statsd codebase to `cpp-statsd-client`.
Copy link

@UdjinM6 UdjinM6 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK d9974fa

kwvg and others added 7 commits September 15, 2024 08:33
Additionally, let's remove the key sanitation logic since keys are
hardcoded.

Co-authored-by: UdjinM6 <[email protected]>
- `timing` cannot contain a negative value, change to `uint64_t`
- remove `sendDouble`, use template specialization of `Send` instead,
  that can be reused as `send`
- rename `value` in `count()` to `delta`
- add more comments

`gaugeDouble` has been kept around because utilizing templates for it
would require us to either write template specializations for every kind
of `gauge` value (implicit conversions will not save us from this) or
move all logic to the header, neither option seems desirable.
This is to avoid the odd circumstance where stats don't work because you
specified everything but didn't explicitly enable it. Using
`-statsenabled` has been deprecated and will be removed in the next
major release.
`statshostname` implies that alongside specifying the host, the user
needs to specify a separate "hostname" field in order to connect when
in fact, it is an entirely optional suffix field applied to stats keys.

Rename it to `statssuffix` and deprecate `statshostname`, the latter
will be removed in a subsequent major release.
`statsns`, aside from being an unclear name, in its default behaviour,
doesn't use the namespace delimiter contaminates the root namespace
of the key.

Let's take care of that by deprecating `statsns` with its replacement,
`statsprefix`, that behaves properly.
This marks the completion of our transition from code based on
`talebook/statsd-client-cpp` to code based on `vthiery/cpp-statsd-client`.

Also, we long stopped using `snprintf`, remove it from exclusions list.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants